Article Overview | Federated Learning x CVPR'2023 (Part 2)
This post, compiled by the blogger Bai Xiaoyu (白小鱼), collects the federated-learning papers from CVPR 2023 together with their abstracts. For the previous installment, see Article Overview | Federated Learning x CVPR'2023 (Part 1).
Authors: Ming Li; Qingli Li; Yan Wang
Conference: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition
Url: https://openaccess.thecvf.com/content/CVPR2023/html/Li_Class_Balanced_Adaptive_Pseudo_Labeling_for_Federated_Semi-Supervised_Learning_CVPR_2023_paper.html
Abstract: This paper focuses on federated semi-supervised learning (FSSL), assuming that few clients have fully labeled data (labeled clients) and the training datasets in other clients are fully unlabeled (unlabeled clients). Existing methods attempt to deal with the challenges caused by not independent and identically distributed data (Non-IID) setting. Though methods such as sub-consensus models have been proposed, they usually adopt standard pseudo labeling or consistency regularization on unlabeled clients which can be easily influenced by imbalanced class distribution. Thus, problems in FSSL are still yet to be solved. To seek for a fundamental solution to this problem, we present Class Balanced Adaptive Pseudo Labeling (CBAFed), to study FSSL from the perspective of pseudo labeling. In CBAFed, the first key element is a fixed pseudo labeling strategy to handle the catastrophic forgetting problem, where we keep a fixed set by letting pass information of unlabeled data at the beginning of the unlabeled client training in each communication round. The second key element is that we design class balanced adaptive thresholds via considering the empirical distribution of all training data in local clients, to encourage a balanced training process. To make the model reach a better optimum, we further propose a residual weight connection in local supervised training and global model aggregation. Extensive experiments on five datasets demonstrate the superiority of CBAFed. Code will be released.
Notes: https://github.com/minglllli/CBAFed
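The class-balanced adaptive thresholds are the most transferable idea here. Below is a minimal sketch of that ingredient, assuming a simple frequency-based scaling rule (the paper derives its thresholds from the empirical distribution of all local training data, so the exact rule differs):

```python
import numpy as np

def class_balanced_thresholds(class_counts, base_tau=0.95, floor=0.5):
    # Illustrative rule: scale a base threshold by relative class frequency,
    # so under-represented classes get a lower bar and more pseudo-labels.
    freq = class_counts / class_counts.sum()
    return floor + (base_tau - floor) * (freq / freq.max())

def select_pseudo_labels(probs, thresholds):
    # Keep samples whose top-class confidence clears that class's threshold.
    preds = probs.argmax(axis=1)
    mask = probs.max(axis=1) >= thresholds[preds]
    return preds[mask], np.where(mask)[0]

counts = np.array([500.0, 100.0, 20.0])          # imbalanced empirical counts
probs = np.random.dirichlet(np.ones(3), size=8)  # toy model predictions
labels, kept = select_pseudo_labels(probs, class_balanced_thresholds(counts))
```

CBAFed additionally fixes the retained pseudo-labeled set at the start of each unlabeled client's training round, which is what counters catastrophic forgetting across communication rounds.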
Authors: Bo Li; Mikkel N. Schmidt; Tommy S. Alstrøm; Sebastian U. Stich
Conference: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition
Url: https://openaccess.thecvf.com/content/CVPR2023/html/Li_On_the_Effectiveness_of_Partial_Variance_Reduction_in_Federated_Learning_CVPR_2023_paper.html
Abstract: Data heterogeneity across clients is a key challenge in federated learning. Prior works address this by either aligning client and server models or using control variates to correct client model drift. Although these methods achieve fast convergence in convex or simple non-convex problems, the performance in over-parameterized models such as deep neural networks is lacking. In this paper, we first revisit the widely used FedAvg algorithm in a deep neural network to understand how data heterogeneity influences the gradient updates across the neural network layers. We observe that while the feature extraction layers are learned efficiently by FedAvg, the substantial diversity of the final classification layers across clients impedes the performance. Motivated by this, we propose to correct model drift by variance reduction only on the final layers. We demonstrate that this significantly outperforms existing benchmarks at a similar or lower communication cost. We furthermore provide proof for the convergence rate of our algorithm.
Notes: [pdf](https://arxiv.org/abs/2212.02191)
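Since the paper's core move is restricting variance reduction to the final layers, a SCAFFOLD-style local step that corrects drift only on the classifier conveys the idea. The control-variate dictionaries `c_global`/`c_local` and the `fc` name prefix below are assumptions of this sketch, not the paper's API:

```python
import torch

def local_step_partial_vr(model, loss_fn, batch, c_global, c_local,
                          lr=0.01, final_prefix="fc"):
    # One client SGD step; the drift correction (+c_global - c_local) is
    # applied only to the final classifier's parameters.
    x, y = batch
    model.zero_grad()
    loss_fn(model(x), y).backward()
    with torch.no_grad():
        for name, p in model.named_parameters():
            g = p.grad
            if name.startswith(final_prefix):
                g = g + c_global[name] - c_local[name]
            p -= lr * g
```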
Authors: SangMook Kim; Sangmin Bae; Hwanjun Song; Se-Young Yun
Conference: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition
Url: https://openaccess.thecvf.com/content/CVPR2023/html/Kim_Re-Thinking_Federated_Active_Learning_Based_on_Inter-Class_Diversity_CVPR_2023_paper.html
Abstract: Although federated learning has made awe-inspiring advances, most studies have assumed that the client's data are fully labeled. However, in a real-world scenario, every client may have a significant amount of unlabeled instances. Among the various approaches to utilizing unlabeled data, a federated active learning framework has emerged as a promising solution. In the decentralized setting, there are two types of available query selector models, namely 'global' and 'local-only' models, but little literature discusses their performance dominance and its causes. In this work, we first demonstrate that the superiority of two selector models depends on the global and local inter-class diversity. Furthermore, we observe that the global and local-only models are the keys to resolving the imbalance of each side. Based on our findings, we propose LoGo, a FAL sampling strategy robust to varying local heterogeneity levels and global imbalance ratio, that integrates both models by two steps of active selection scheme. LoGo consistently outperforms six active learning strategies in the total number of 38 experimental settings.
Notes:
[pdf](http://arxiv.org/abs/2303.12317)
[code](https://github.com/raymin0223/LoGo)
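One way to picture LoGo's two-step active selection scheme; the clustering features and the margin-based uncertainty below are illustrative stand-ins, not the paper's exact criteria:

```python
import numpy as np
from sklearn.cluster import KMeans

def two_step_query(local_feats, global_probs, budget):
    # Macro step: partition the unlabeled pool using local-model features.
    clusters = KMeans(n_clusters=budget, n_init=10).fit_predict(local_feats)
    # Micro step: within each cluster, pick the sample the global model is
    # least certain about (small top-1 vs top-2 probability margin).
    top2 = np.sort(global_probs, axis=1)[:, -2:]
    uncertainty = 1.0 - (top2[:, 1] - top2[:, 0])
    picked = []
    for c in range(budget):
        idx = np.where(clusters == c)[0]
        if idx.size:
            picked.append(idx[np.argmax(uncertainty[idx])])
    return np.array(picked)
```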
Authors: Meirui Jiang; Holger R. Roth; Wenqi Li; Dong Yang; Can Zhao; Vishwesh Nath; Daguang Xu; Qi Dou; Ziyue Xu
Conference: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition
Url: https://openaccess.thecvf.com/content/CVPR2023/html/Jiang_Fair_Federated_Medical_Image_Segmentation_via_Client_Contribution_Estimation_CVPR_2023_paper.html
Abstract: How to ensure fairness is an important topic in federated learning (FL). Recent studies have investigated how to reward clients based on their contribution (collaboration fairness), and how to achieve uniformity of performance across clients (performance fairness). Despite achieving progress on either one, we argue that it is critical to consider them together, in order to engage and motivate more diverse clients joining FL to derive a high-quality global model. In this work, we propose a novel method to optimize both types of fairness simultaneously. Specifically, we propose to estimate client contribution in gradient and data space. In gradient space, we monitor the gradient direction differences of each client with respect to others. And in data space, we measure the prediction error on client data using an auxiliary model. Based on this contribution estimation, we propose a FL method, federated training via contribution estimation (FedCE), i.e., using estimation as global model aggregation weights. We have theoretically analyzed our method and empirically evaluated it on two real-world medical datasets. The effectiveness of our approach has been validated with significant performance improvements, better collaboration fairness, better performance fairness, and comprehensive analytical studies.
Notes:
[pdf](http://arxiv.org/abs/2303.16520)
[code](https://github.com/NVIDIA/NVFlare/tree/dev/research/fed-ce)
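A rough sketch of turning the two contribution estimates into aggregation weights follows. FedCE's actual estimators differ in detail, and combining the two scores by a product is an assumption made here:

```python
import numpy as np

def contribution_weights(client_grads, aux_model_errors, eps=1e-8):
    # Gradient space: how well each client's update direction agrees with
    # the other clients' directions (mean pairwise cosine similarity).
    G = np.stack(client_grads)
    G = G / (np.linalg.norm(G, axis=1, keepdims=True) + eps)
    sim = G @ G.T
    n = len(client_grads)
    grad_score = (sim.sum(axis=1) - 1.0) / (n - 1)
    # Data space: prediction error of an auxiliary model on each client's
    # data; a larger error is read here as a larger marginal contribution.
    data_score = np.asarray(aux_model_errors)
    score = np.clip(grad_score, eps, None) * np.clip(data_score, eps, None)
    return score / score.sum()   # use as global aggregation weights
```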
Authors: Fatih Ilhan; Gong Su; Ling Liu
Conference: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition
Url: https://openaccess.thecvf.com/content/CVPR2023/html/Ilhan_ScaleFL_Resource-Adaptive_Federated_Learning_With_Heterogeneous_Clients_CVPR_2023_paper.html
Abstract: Federated learning (FL) is an attractive distributed learning paradigm supporting real-time continuous learning and client privacy by default. In most FL approaches, all edge clients are assumed to have sufficient computation capabilities to participate in the learning of a deep neural network (DNN) model. However, in real-life applications, some clients may have severely limited resources and can only train a much smaller local model. This paper presents ScaleFL, a novel FL approach with two distinctive mechanisms to handle resource heterogeneity and provide an equitable FL framework for all clients. First, ScaleFL adaptively scales down the DNN model along width and depth dimensions by leveraging early exits to find the best-fit models for resource-aware local training on distributed clients. In this way, ScaleFL provides an efficient balance of preserving basic and complex features in local model splits with various sizes for joint training while enabling fast inference for model deployment. Second, ScaleFL utilizes self-distillation among exit predictions during training to improve aggregation through knowledge transfer among subnetworks. We conduct extensive experiments on benchmark CV (CIFAR-10/100, ImageNet) and NLP datasets (SST-2, AgNews). We demonstrate that ScaleFL outperforms existing representative heterogeneous FL approaches in terms of global/local model performance and provides inference efficiency, with up to 2x latency and 4x model size reduction with negligible performance drop below 2%.
Notes:
[code](https://github.com/git-disl/scale-fl)
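The self-distillation among exit predictions boils down to adding a knowledge-distillation term from the deepest exit to the shallower ones. A minimal sketch, with temperature and weighting chosen for illustration rather than taken from the paper:

```python
import torch
import torch.nn.functional as F

def multi_exit_loss(exit_logits, targets, T=3.0, alpha=0.5):
    # Every exit is trained with cross-entropy; shallower exits additionally
    # distill from the softened predictions of the final (deepest) exit.
    teacher = F.softmax(exit_logits[-1].detach() / T, dim=1)
    loss = sum(F.cross_entropy(z, targets) for z in exit_logits)
    for z in exit_logits[:-1]:
        loss = loss + alpha * (T * T) * F.kl_div(
            F.log_softmax(z / T, dim=1), teacher, reduction="batchmean")
    return loss
```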
Authors: Wenke Huang; Mang Ye; Zekun Shi; He Li; Bo Du
Conference: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition
Url: https://openaccess.thecvf.com/content/CVPR2023/html/Huang_Rethinking_Federated_Learning_With_Domain_Shift_A_Prototype_View_CVPR_2023_paper.html
Abstract: Federated learning shows a bright promise as a privacy-preserving collaborative learning technique. However, prevalent solutions mainly focus on all private data sampled from the same domain. An important challenge is that when distributed data are derived from diverse domains. The private model presents degenerative performance on other domains (with domain shift). Therefore, we expect that the global model optimized after the federated learning process stably provides generalizability performance on multiple domains. In this paper, we propose Federated Prototypes Learning (FPL) for federated learning under domain shift. The core idea is to construct cluster prototypes and unbiased prototypes, providing fruitful domain knowledge and a fair convergent target. On the one hand, we pull the sample embedding closer to cluster prototypes belonging to the same semantics than cluster prototypes from distinct classes. On the other hand, we introduce consistency regularization to align the local instance with the respective unbiased prototype. Empirical results on Digits and Office Caltech tasks demonstrate the effectiveness of the proposed solution and the efficiency of crucial modules.
Notes: https://github.com/WenkeHuang/RethinkFL
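The two prototype objectives can be sketched as below, assuming normalized embeddings, an InfoNCE-style pull toward same-class cluster prototypes, and an L2 consistency term toward the class's unbiased prototype (the paper's exact formulation differs):

```python
import torch
import torch.nn.functional as F

def prototype_losses(z, y, cluster_protos, proto_labels, unbiased_protos, tau=0.07):
    # Pull each embedding toward cluster prototypes of its own class and away
    # from prototypes of other classes, and keep it consistent with the
    # class's unbiased prototype.
    z = F.normalize(z, dim=1)
    p = F.normalize(cluster_protos, dim=1)
    logits = z @ p.t() / tau                                  # (batch, n_protos)
    pos = (y.unsqueeze(1) == proto_labels.unsqueeze(0)).float()
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    contrastive = -(log_prob * pos).sum(1) / pos.sum(1).clamp(min=1)
    consistency = F.mse_loss(z, F.normalize(unbiased_protos[y], dim=1))
    return contrastive.mean() + consistency
```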
Authors: Chun-Mei Feng; Bangjun Li; Xinxing Xu; Yong Liu; Huazhu Fu; Wangmeng Zuo
Conference: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition
Url: https://openaccess.thecvf.com/content/CVPR2023/html/Feng_Learning_Federated_Visual_Prompt_in_Null_Space_for_MRI_Reconstruction_CVPR_2023_paper.html
Abstract: Federated Magnetic Resonance Imaging (MRI) reconstruction enables multiple hospitals to collaborate distributedly without aggregating local data, thereby protecting patient privacy. However, the data heterogeneity caused by different MRI protocols, insufficient local training data, and limited communication bandwidth inevitably impair global model convergence and updating. In this paper, we propose a new algorithm, FedPR, to learn federated visual prompts in the null space of global prompt for MRI reconstruction. FedPR is a new federated paradigm that adopts a powerful pre-trained model while only learning and communicating the prompts with few learnable parameters, thereby significantly reducing communication costs and achieving competitive performance on limited local data. Moreover, to deal with catastrophic forgetting caused by data heterogeneity, FedPR also updates efficient federated visual prompts that project the local prompts into an approximate null space of the global prompt, thereby suppressing the interference of gradients on the server performance. Extensive experiments on federated MRI show that FedPR significantly outperforms state-of-the-art FL algorithms with < 6% of communication costs when given the limited amount of local data.
Notes:
[pdf](http://arxiv.org/abs/2303.16181)
[code](https://github.com/chunmeifeng/FedPR)
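The null-space projection can be pictured as follows, assuming the global prompts are stacked as columns of a matrix and the principal subspace is cut off at an illustrative energy threshold; FedPR's actual construction of the approximate null space differs:

```python
import torch

def project_into_null_space(local_grad, global_prompts, energy=0.95):
    # SVD of the stacked global prompts; keep the principal subspace that
    # explains `energy` of the spectrum, and project the local prompt
    # gradient onto its orthogonal complement.
    U, S, _ = torch.linalg.svd(global_prompts, full_matrices=True)
    ratio = torch.cumsum(S**2, 0) / (S**2).sum()
    r = int((ratio < energy).sum()) + 1
    U_null = U[:, r:]                       # approximate null-space basis
    return U_null @ (U_null.T @ local_grad)
```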
Authors: Jian-hui Duan; Wenzhong Li; Derun Zou; Ruichen Li; Sanglu Lu
Conference: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition
Url: https://openaccess.thecvf.com/content/CVPR2023/html/Duan_Federated_Learning_With_Data-Agnostic_Distribution_Fusion_CVPR_2023_paper.html
Abstract: Federated learning has emerged as a promising distributed machine learning paradigm to preserve data privacy. One of the fundamental challenges of federated learning is that data samples across clients are usually not independent and identically distributed (non-IID), leading to slow convergence and severe performance drop of the aggregated global model. To facilitate model aggregation on non-IID data, it is desirable to infer the unknown global distributions without violating privacy protection policy. In this paper, we propose a novel data-agnostic distribution fusion based model aggregation method called FedFusion to optimize federated learning with non-IID local datasets, based on which the heterogeneous clients' data distributions can be represented by a global distribution of several virtual fusion components with different parameters and weights. We develop a Variational AutoEncoder (VAE) method to learn the optimal parameters of the distribution fusion components based on limited statistical information extracted from the local models, and apply the derived distribution fusion model to optimize federated model aggregation with non-IID data. Extensive experiments based on various federated learning scenarios with real-world datasets show that FedFusion achieves significant performance improvement compared to the state-of-the-art.
Notes: https://github.com/LiruichenSpace/FedFusion
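As a much-simplified stand-in for the VAE-learned fusion, the global distribution can be pictured as a mixture of per-client Gaussian summaries. Here the component weights are just normalized sample counts, whereas FedFusion learns the component parameters and weights:

```python
import numpy as np

def fuse_distributions(mus, variances, counts):
    # Fuse per-client Gaussian summaries (mu_i, var_i, n_i) into a global
    # mixture; the fused moments follow the law of total mean/variance.
    w = np.asarray(counts, float)
    w /= w.sum()
    mu_g = sum(wi * mi for wi, mi in zip(w, mus))
    var_g = sum(wi * (vi + (mi - mu_g) ** 2)
                for wi, vi, mi in zip(w, variances, mus))
    return w, mu_g, var_g     # component weights + fused global moments
```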
Authors: Jiahua Dong; Duzhen Zhang; Yang Cong; Wei Cong; Henghui Ding; Dengxin Dai
Conference: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition
Url: https://openaccess.thecvf.com/content/CVPR2023/html/Dong_Federated_Incremental_Semantic_Segmentation_CVPR_2023_paper.html
Abstract: Federated learning-based semantic segmentation (FSS) has drawn widespread attention via decentralized training on local clients. However, most FSS models assume that categories are fixed in advance, and therefore heavily forget old categories in practical applications where local clients receive new categories incrementally while having no memory to access old classes. Moreover, new clients collecting new classes may join the global training of FSS, which further aggravates catastrophic forgetting. To overcome the above challenges, we propose a Forgetting-Balanced Learning (FBL) model that addresses heterogeneous forgetting of old classes from both intra-client and inter-client aspects. Specifically, guided by pseudo labels generated via adaptive class-balanced pseudo labeling, we develop a forgetting-balanced semantic compensation loss and a forgetting-balanced relation consistency loss to rectify intra-client heterogeneous forgetting of old categories with background shift, performing balanced gradient propagation and relation consistency distillation within local clients. Furthermore, to tackle heterogeneous forgetting from the inter-client aspect, we propose a task transition monitor, which can identify new classes under privacy protection and store the latest old global model for relation distillation. Qualitative experiments show large improvements of our model over comparison methods. The code is available at https://github.com/JiahuaDong/FISS.
Notes: https://github.com/JiahuaDong/FISS
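The relation consistency distillation between the stored old global model and the current model can be sketched as matching pairwise feature similarities; the forgetting-balanced weighting in FBL is omitted here:

```python
import torch
import torch.nn.functional as F

def relation_consistency_loss(feat_new, feat_old):
    # Match pairwise cosine similarities between spatial locations of the
    # current features and those of the stored old global model.
    def relation(f):                               # (B, C, H, W)
        f = F.normalize(f.flatten(2), dim=1)       # unit channel vectors
        return torch.einsum("bci,bcj->bij", f, f)  # (B, HW, HW) similarities
    # Note: HW x HW can be large; pool or downsample features in practice.
    return F.mse_loss(relation(feat_new), relation(feat_old.detach()))
```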
Authors: Ka-Ho Chow; Ling Liu; Wenqi Wei; Fatih Ilhan; Yanzhao Wu
Conference: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition
Url: https://openaccess.thecvf.com/content/CVPR2023/html/Chow_STDLens_Model_Hijacking-Resilient_Federated_Learning_for_Object_Detection_CVPR_2023_paper.html
Abstract: Federated Learning (FL) has been gaining popularity as a collaborative learning framework to train deep learning-based object detection models over a distributed population of clients. Despite its advantages, FL is vulnerable to model hijacking. The attacker can control how the object detection system should misbehave by implanting Trojaned gradients using only a small number of compromised clients in the collaborative learning process. This paper introduces STDLens, a principled approach to safeguarding FL against such attacks. We first investigate existing mitigation mechanisms and analyze their failures caused by the inherent errors in spatial clustering analysis on gradients. Based on the insights, we introduce a three-tier forensic framework to identify and expel Trojaned gradients and reclaim the performance over the course of FL. We consider three types of adaptive attacks and demonstrate the robustness of STDLens against advanced adversaries. Extensive experiments show that STDLens can protect FL against different model hijacking attacks and outperform existing methods in identifying and removing Trojaned gradients with significantly higher precision and much lower false-positive rates. The source code is available at https://github.com/git-disl/STDLens.
Notes:
[pdf](http://arxiv.org/abs/2303.11511)
[code](https://github.com/git-disl/STDLens)
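A toy version of the forensic step flags a minority cluster of client gradients as suspicious; STDLens itself layers spatial sign/density analysis over a window of rounds on top of this, so treat the snippet as a caricature of the idea:

```python
import numpy as np
from sklearn.cluster import KMeans

def flag_suspicious_clients(grads, n_clusters=2):
    # Cluster normalized client gradients and mark the minority cluster.
    G = np.stack([g / (np.linalg.norm(g) + 1e-8) for g in grads])
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(G)
    counts = np.bincount(labels, minlength=n_clusters)
    suspicious = np.argmin(counts)
    return np.where(labels == suspicious)[0]   # indices to exclude this round
```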
Authors: Kangyang Luo; Xiang Li; Yunshi Lan; Ming Gao
Conference: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition
Url: https://openaccess.thecvf.com/content/CVPR2023/html/Luo_GradMA_A_Gradient-Memory-Based_Accelerated_Federated_Learning_With_Alleviated_Catastrophic_Forgetting_CVPR_2023_paper.html
Abstract: Federated Learning (FL) has emerged as a de facto machine learning area and received rapid increasing research interests from the community. However, catastrophic forgetting caused by data heterogeneity and partial participation poses distinctive challenges for FL, which are detrimental to the performance. To tackle the problems, we propose a new FL approach (namely GradMA), which takes inspiration from continual learning to simultaneously correct the server-side and worker-side update directions as well as take full advantage of server's rich computing and memory resources. Furthermore, we elaborate a memory reduction strategy to enable GradMA to accommodate FL with a large scale of workers. We then analyze convergence of GradMA theoretically under the smooth non-convex setting and show that its convergence rate achieves a linear speed up w.r.t the increasing number of sampled active workers. At last, our extensive experiments on various image classification tasks show that GradMA achieves significant performance gains in accuracy and communication efficiency compared to SOTA baselines. We provide our code here: https://github.com/lkyddd/GradMA.
Notes:
[pdf](http://arxiv.org/abs/2302.14307)
[code](https://github.com/lkyddd/gradma)
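The gradient-memory primitive GradMA builds on can be illustrated with a GEM/PCGrad-style projection: if a proposed update conflicts with a memorized gradient, remove the conflicting component. GradMA itself solves a quadratic program over the memory rather than this greedy loop:

```python
import torch

def correct_direction(g, memory):
    # `memory` holds gradients from previous rounds / absent workers.
    for m in memory:
        dot = torch.dot(g, m)
        if dot < 0:   # conflict: this update would undo remembered progress
            g = g - (dot / (torch.dot(m, m) + 1e-12)) * m
    return g
```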
Authors: Dengsheng Chen; Jie Hu; Vince Junkai Tan; Xiaoming Wei; Enhua Wu
Conference: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition
Url: https://openaccess.thecvf.com/content/CVPR2023/html/Chen_Elastic_Aggregation_for_Federated_Optimization_CVPR_2023_paper.html
Abstract: Federated learning enables the privacy-preserving training of neural network models using real-world data across distributed clients. FedAvg has become the preferred optimizer for federated learning because of its simplicity and effectiveness. FedAvg uses naive aggregation to update the server model, interpolating client models based on the number of instances used in their training. However, naive aggregation suffers from client-drift when the data is heterogenous (non-IID), leading to unstable and slow convergence. In this work, we propose a novel aggregation approach, elastic aggregation, to overcome these issues. Elastic aggregation interpolates client models adaptively according to parameter sensitivity, which is measured by computing how much the overall prediction function output changes when each parameter is changed. This measurement is performed in an unsupervised and online manner. Elastic aggregation reduces the magnitudes of updates to the more sensitive parameters so as to prevent the server model from drifting to any one client distribution, and conversely boosts updates to the less sensitive parameters to better explore different client distributions. Empirical results on real and synthetic data as well as analytical results show that elastic aggregation leads to efficient training in both convex and non-convex settings, while being fully agnostic to client heterogeneity and robust to large numbers of clients, partial participation, and imbalanced data. Finally, elastic aggregation works well with other federated optimizers and achieves significant improvements across the board.
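A sketch of sensitivity-aware aggregation, under the assumptions that client updates are simply averaged per parameter and that `sensitivity` holds precomputed per-parameter scores (e.g., accumulated squared gradients of the prediction output); the paper's exact scaling rule differs:

```python
import torch

def elastic_aggregate(server_params, client_deltas, sensitivity, zeta=0.5):
    # Average client updates per parameter, then damp coordinates whose
    # parameters are highly sensitive, so the server model is not dragged
    # toward any single client's distribution.
    new_params = {}
    for name, theta in server_params.items():
        delta = torch.stack([d[name] for d in client_deltas]).mean(0)
        s = sensitivity[name]
        scale = 1.0 - zeta * s / (s.max() + 1e-12)  # sensitive -> smaller step
        new_params[name] = theta + scale * delta
    return new_params
```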
Author: Bai Xiaoyu (白小鱼), Ph.D. student, Department of Computer Science, Shanghai Jiao Tong University
Recommended reading:
- Notes | Zhejiang University Summer Cryptography Course: Universal Composability (1)
- Article Overview | Federated Learning x CVPR'2023 (Part 1)
- Course Registration | Privacy Computing Summer Course, School of Cyber Science and Technology, Shandong University